GH-36411: [Python] Use scikit-build-core as build backend for PyArrow and get rid of setup.py #49259
raulcd wants to merge 19 commits into apache:main
@github-actions crossbow submit -g python
@github-actions crossbow submit wheel-*-cp313-cp313-amd64
@github-actions crossbow submit wheel-windows-cp313-cp313-amd64
Revision: 166ff63 Submitted crossbow builds: ursacomputing/crossbow @ actions-f1bbdf4eaa
@github-actions crossbow submit wheel-windows-cp313-cp313-amd64
Revision: 36fefd4 Submitted crossbow builds: ursacomputing/crossbow @ actions-bd811d95ea
@github-actions crossbow submit -g python -g wheel
@github-actions crossbow submit python-sdist
Revision: c518c90 Submitted crossbow builds: ursacomputing/crossbow @ actions-ea249e92ce
@github-actions crossbow submit python-sdist
Revision: 131e2c0 Submitted crossbow builds: ursacomputing/crossbow @ actions-1c58e78aff
@github-actions crossbow submit python-sdist
Revision: 30e04be Submitted crossbow builds: ursacomputing/crossbow @ actions-3179f97956
@github-actions crossbow submit -g python -g wheel
@github-actions crossbow submit wheel-windows-cp313-cp313-amd64
Revision: 5606749 Submitted crossbow builds: ursacomputing/crossbow @ actions-3c4293e220
…re CMAKE_BUILD_TYPE isn't populated
…emove usage of PYARROW_CMAKE_GENERATOR
…tings cmake.build-type usage
…ngs cmake.args usage
…s and update usage. Also document --config-settings cmake.build-type
cdd7629 to 1110dbe
@github-actions crossbow submit -g python -g wheel
@github-actions crossbow submit wheel-cp313
Revision: 75d2f7f Submitted crossbow builds: ursacomputing/crossbow @ actions-fda9c59efb
@github-actions crossbow submit -g python -g wheel
Revision: 75d2f7f Submitted crossbow builds: ursacomputing/crossbow @ actions-62c2fce38f
WillAyd left a comment
Nice work - some small comments but generally this lgtm
- Number of processes used to compile PyArrow’s C++/Cython components
- ``''``

Note that ``pip install`` uses ``--config-settings`` (plural) while
I don't remember this nuance from the meson-python branch. Looks like there we just use -C to pass arguments to the front-end, whether it's pip or build. Maybe we should use that here and avoid this altogether?
That's not a bad tip. Thanks @WillAyd! Let me see whether -C works for both. I found it pretty annoying that the API was different.
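For illustration, the `-C` suggestion above can be sketched as a small helper that builds the same style of command line for either front-end. This is a minimal sketch, not code from the PR: the function name `frontend_command` is hypothetical, and it assumes a recent pip and `build`, both of which accept the short `-C` flag. The `cmake.build-type` key is the scikit-build-core setting mentioned in this PR's commits.

```python
import shlex


def frontend_command(frontend, settings, source_dir="."):
    """Return an argv list passing config settings via the short -C flag.

    `frontend_command` is a hypothetical helper for illustration only.
    """
    if frontend == "pip":
        cmd = ["python", "-m", "pip", "install", source_dir]
    elif frontend == "build":
        cmd = ["python", "-m", "build", source_dir]
    else:
        raise ValueError(f"unknown front-end: {frontend!r}")
    # -C is spelled the same for both front-ends, unlike the long options
    # (--config-settings for pip, --config-setting for build).
    for key, value in settings.items():
        cmd += ["-C", f"{key}={value}"]
    return cmd


if __name__ == "__main__":
    for name in ("pip", "build"):
        print(shlex.join(frontend_command(name, {"cmake.build-type": "Debug"})))
```

Using the short flag everywhere would let the documentation show one spelling instead of explaining the plural/singular difference.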
try:
    yield
finally:
    if sys.platform != "win32":
So these edits are made to the source tree directly right? No way to do this in an isolated build location?
I am unsure how we could manage those. We want to do non-isolated builds to build against specific numpy/pandas versions; mandating isolated builds adds complexity to environment and dependency management. If we copy ourselves to a different location to build, we also create a bunch of complexity for a really simple thing: having licenses copied! Do you have any suggestions on what I could try?
To be fair, I've found the way build backends handle license and notice files to be too rigid and inflexible, leaving cases like ours in a difficult spot. I have been thinking of just copying the files, but keeping them in sync is a maintenance burden. Maybe we should copy them and have a CI check that validates they are in sync?
I feel like a really simple build backend that just copies the data, either by turning the symlinks into hard links or, in the case of Windows, copying the files, is the simplest solution.
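As a rough idea of what the CI check floated above could look like, here is a minimal sketch that compares copied license files byte-for-byte against their originals. The function name `out_of_sync` and the file paths are hypothetical and not part of this PR.

```python
import filecmp
from pathlib import Path


def out_of_sync(pairs):
    """Return the copies that are missing or differ from their original.

    `pairs` is an iterable of (original, copy) paths; hypothetical helper.
    """
    stale = []
    for original, copy in pairs:
        original, copy = Path(original), Path(copy)
        # shallow=False forces a content comparison instead of trusting
        # size and modification time.
        if not copy.is_file() or not filecmp.cmp(original, copy, shallow=False):
            stale.append(copy)
    return stale


if __name__ == "__main__":
    # Hypothetical layout: top-level files copied into the python/ directory.
    pairs = [("LICENSE.txt", "python/LICENSE.txt"),
             ("NOTICE.txt", "python/NOTICE.txt")]
    stale = out_of_sync(p for p in pairs if Path(p[0]).is_file())
    raise SystemExit(1 if stale else 0)
```

A check like this trades the build-time link handling for a lint step, at the cost of keeping duplicated files in the repository.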
Yeah, I agree with your overall sentiment. Sadly, it's a lot of effort to maintain these two files :-)
I think the meson-python approach was pretty good, where a dist script copied these into the source distribution at the time it was being made. I'm wondering if scikit-build-core has a similar hook.
Rationale for this change
Move the PyArrow build backend from setuptools and a custom setup.py to scikit-build-core, a build backend for CMake-based projects.
What changes are included in this PR?
Move from setuptools to scikit-build-core and remove PyArrow setup.py. Update some of the build requirements and minor fixes.
A custom build backend has also been created to wrap scikit-build-core and fix problems with license files in monorepos.
pyproject.toml metadata validation expects license files to exist before the build backend runs, which is why we create symlinks. Our thin build backend just turns those symlinks into hard links so that the license and notice files contain the actual contents and are added to the sdist.
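The wrapping idea described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the backend added in this PR: the names `materialized_license_files` and `LICENSE_FILES` are hypothetical, the copy fallback is an assumption, and only the `build_sdist` hook is shown. It relies on the standard PEP 517 hook signature and on `scikit_build_core.build` being the backend module that is wrapped.

```python
import contextlib
import os
import shutil
from pathlib import Path

# Hypothetical file names; the real backend may track different paths.
LICENSE_FILES = ("LICENSE.txt", "NOTICE.txt")


@contextlib.contextmanager
def materialized_license_files(paths):
    """Temporarily replace symlinks with real files, then restore them."""
    replaced = []  # (path, original link target) for every symlink removed
    try:
        for path in map(Path, paths):
            if not path.is_symlink():
                continue  # already a regular file, nothing to do
            link_target = os.readlink(path)
            source = path.resolve()
            path.unlink()
            replaced.append((path, link_target))
            try:
                os.link(source, path)  # hard link carries the real contents
            except OSError:
                shutil.copy2(source, path)  # assumed fallback: plain copy
        yield
    finally:
        # Put the symlinks back so the source tree is left unchanged.
        for path, link_target in replaced:
            path.unlink(missing_ok=True)
            os.symlink(link_target, path)


def build_sdist(sdist_directory, config_settings=None):
    """PEP 517 hook that delegates to scikit-build-core."""
    # Imported lazily so the helper above works without the backend installed.
    from scikit_build_core import build as backend

    with materialized_license_files(LICENSE_FILES):
        return backend.build_sdist(sdist_directory, config_settings)
```

The edits happen in the source tree itself, which is why the restore step lives in a `finally` block: the tree returns to its symlinked state even if the build fails.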
Remove flags that are no longer used (they were only part of setup.py), and document and validate how the equivalent flags have to be used now.
Are these changes tested?
Yes, all Python CI tests, wheel builds and sdist builds are successful.
Are there any user-facing changes?
Yes, users building PyArrow will now need the new build dependencies, and depending on the flags they use they may need to switch to the newly documented way of passing those flags.